DEVICE PREDICTING A BRANCH OF AN INSTRUCTION
EQUIVALENT TO A SUBROUTINE RETURN AND A METHOD 
THEREOF 

Background of the Invention
Field of the Invention 

The present invention relates to an information
processing device having a branch predicting
mechanism, and more particularly, to a branch
predicting device that predicts a branch of an
instruction equivalent to a subroutine return in an
architecture that does not provide a dedicated
instruction for a subroutine return.

Description of the Related Art

In a conventional instruction processing device,
performance is improved by using techniques such as
pipeline processing and out-of-order processing to
start the execution of succeeding instructions
sequentially, without waiting for the execution of
one instruction to complete.

In pipeline processing, if a preceding instruction
changes the execution sequence of succeeding
instructions, as a branch instruction does, the
instruction at the branch destination must be entered
into the execution pipeline when the branch is taken.
Otherwise, the execution pipeline is disturbed and,
in the worst case, performance is degraded rather
than improved.

Accordingly, attempts are made to improve
performance by providing a branch predicting
mechanism, a representative example of which is a
branch history (branch prediction table), and by
predicting whether or not a branch is taken. If such
a device predicts that a branch is taken, the
instruction at the branch destination is entered into
the execution pipeline after the branch instruction.
Therefore, the execution pipeline is not disturbed
when the branch is actually taken.

Additionally, the branch destination (return
destination) of a subroutine return instruction may
vary at each execution because of the nature of the
instruction itself: the location of the subroutine
call instruction that is the call source can differ
at each execution. For such an instruction, it is
known that performance can be improved by providing
a dedicated branch predicting mechanism called a
return address stack.
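
The behavior of a conventional return address stack
can be illustrated with the following minimal sketch.
It is not taken from the specification: the class
name, the stack depth, the overflow policy, and the
addresses are assumptions chosen only for
illustration.

```python
class ReturnAddressStack:
    """Fixed-depth return address stack (depth is an illustrative assumption)."""

    def __init__(self, depth=4):
        self.depth = depth
        self.entries = []

    def push(self, return_address):
        # On a subroutine call, remember where the matching return should go.
        if len(self.entries) == self.depth:
            self.entries.pop(0)  # assumed policy: the oldest entry is lost on overflow
        self.entries.append(return_address)

    def pop(self):
        # On a subroutine return, predict the most recently pushed address.
        return self.entries.pop() if self.entries else None


ras = ReturnAddressStack()
ras.push(0x1004)    # subroutine called from a first call site
first = ras.pop()   # predicted return destination for that call
ras.push(0x2008)    # same subroutine called from a different call site
second = ras.pop()  # the prediction follows the new call site
```

Because the prediction comes from the most recent
call rather than from the previous execution of the
return instruction, the predicted destination changes
together with the call source.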

However, the conventional branch predicting
mechanism described above has the following problems.

For some CPU (Central Processing Unit)
architectures, no dedicated instructions are provided
as a subroutine call/return instruction pair. To
improve performance in such architectures by adopting
a return address stack, a technique is required for
dynamically extracting instruction pairs equivalent
to a subroutine call/return from the branch
instructions to be executed.

However, in a conventional information processing
device, whether or not an instruction is a subroutine
call/return instruction is determined statically at
the time of decoding. Therefore, programming that
differs from the interpretation made by the hardware
is undesirable. Once such programming causes the
correspondence of a call/return pair to differ from
the actual one, succeeding branch destinations are
also mismatched in succession because of the nature
of the return address stack. The more stages the
return address stack has, the worse the performance
becomes.

Fig. 1 exemplifies a program including
subroutine call/return instruction pairs used in such
an architecture.

In this example, a subroutine S1 is called by an
instruction "balr 14, 15" in a main routine (Call 1),
and another subroutine S2 is further called by an
instruction "balr 15, 13" in the subroutine S1 (Call
2). Then, control is returned to the subroutine S1 by
a conditional return instruction "bcr 7, 15" (Return
2), and further returned to the main routine by an
unconditional return instruction "bcr 15, 14" (Return
1).

Here, assume that the instruction processing 
device recognizes a particular operation code "balr" 
to be an instruction equivalent to a subroutine call, 
and an unconditional branch instruction "bcr 15, x" 
(x is arbitrary) including a particular operation 
code and operand to be an instruction equivalent to 
a subroutine return. 

In this case, an instruction "bcr 7, 15" in the 
subroutine S2 is not recognized to be an instruction 
equivalent to a subroutine return, and is overlooked. 
Accordingly, a conventional return address stack 
recognizes Return 1 to be the return corresponding to 
Call 2, and a branch prediction results in a failure. 
Actually, the correct return corresponding to Call 2 
is Return 2. 

Additionally, if the instruction processing
device simply recognizes every instruction including
the operation code "bcr" to be an instruction
equivalent to a subroutine return, "bcr 4, 3", which
is a mere conditional branch instruction in the
subroutine S2, is recognized to be the return
corresponding to Call 2. Therefore, the return
address stack erroneously recognizes a call/return
pair in this case as well.
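
The two failure cases above can be reproduced with a
small simulation. This is only a sketch: the
addresses, the position of "bcr 4, 3" in the executed
sequence, and the names TRACE and simulate are
hypothetical, since Fig. 1 itself is not reproduced
here.

```python
# Executed branch sequence for the Fig. 1 scenario; all addresses are hypothetical.
# Entry: (opcode, first operand, second operand, return address of a call, actual target)
TRACE = [
    ("balr", 14, 15, 0x104, 0x200),  # Call 1: main routine -> S1
    ("balr", 15, 13, 0x204, 0x300),  # Call 2: S1 -> S2
    ("bcr", 4, 3, None, 0x310),      # ordinary conditional branch inside S2
    ("bcr", 7, 15, None, 0x204),     # Return 2: S2 -> S1
    ("bcr", 15, 14, None, 0x104),    # Return 1: S1 -> main routine
]


def simulate(is_return):
    """Run a return address stack with a static return-recognition rule."""
    stack, results = [], []
    for op, first, second, link, target in TRACE:
        if op == "balr":
            stack.append(link)
        elif is_return(op, first, second):
            predicted = stack.pop() if stack else None
            results.append((f"{op} {first}, {second}", predicted == target))
    return results


# Rule 1: only the unconditional form "bcr 15, x" is treated as a return.
only_unconditional = simulate(lambda op, mask, reg: op == "bcr" and mask == 15)
# Rule 2: every "bcr" is treated as a return.
every_bcr = simulate(lambda op, mask, reg: op == "bcr")
```

Under the first rule, Return 2 is overlooked and
Return 1 pops the address pushed by Call 2. Under the
second rule, "bcr 4, 3" consumes the entry of Call 2,
and every later prediction is shifted by one entry.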

As described above, in an information processing
device comprising a return address stack, it is vital
to recognize a correct subroutine call/return
instruction pair when instructions are executed.

Summary of the Invention 

An object of the present invention is to provide
a branch predicting device which correctly recognizes
an instruction equivalent to a subroutine return in
an information processing device that does not
provide a dedicated instruction for the subroutine
return, and a method thereof.

In a first aspect of the present invention, a 
branch predicting device comprises a storing circuit, 
a comparing circuit, and an identifying circuit. 

The storing circuit stores information
specifying a return address of a subroutine when an
instruction equivalent to a subroutine call is
detected. The comparing circuit makes a comparison
between information specifying a branch destination
address of an instruction which can possibly be an
instruction equivalent to a subroutine return and the
information specifying the return address, which is
stored in the storing circuit, when the instruction
which can possibly be the instruction equivalent to
the subroutine return is detected, and outputs the
result of the comparison. The identifying circuit
identifies the instruction equivalent to the
subroutine return, which corresponds to the above
described instruction equivalent to the subroutine
call, based on the result of the comparison.

In a second aspect of the present invention, a
branch predicting device comprises a stack circuit,
a push circuit, a comparing circuit, and an
identifying circuit.

The stack circuit stores the information
specifying a return address of a subroutine. The push
circuit pushes the information specifying the return
address onto the stack circuit.

The comparing circuit makes a comparison between
information specifying a branch destination address
of an instruction which can possibly be an
instruction equivalent to a subroutine return and the
information specifying the return address, which is
stored in the top entry of the stack circuit, when
the instruction which can possibly be the instruction
equivalent to the subroutine return is detected, and
outputs the result of the comparison. The identifying
circuit identifies the instruction equivalent to the
subroutine return, which corresponds to the above
described instruction equivalent to the subroutine
call, based on the result of the comparison.

In a third aspect of the present invention, a 
branch predicting device comprises a return address 
stack circuit, a comparing circuit, and an 
identifying circuit. 

The return address stack circuit stores the
return address of a subroutine when an instruction
equivalent to a subroutine call is detected. The
comparing circuit makes a comparison between a branch
destination address of an instruction which can
possibly be an instruction equivalent to a subroutine
return and the return address stored in the return
address stack circuit, and outputs the result of the
comparison. The identifying circuit identifies the
instruction equivalent to the subroutine return,
which corresponds to the above described instruction
equivalent to the subroutine call.



Brief Description of the Drawings 

Fig. 1 is a schematic diagram showing a
subroutine call/return instruction pair;

Fig. 2A is a block diagram showing the principle
of a branch predicting device according to the
present invention;

Fig. 2B shows an instruction code;

Fig. 3 is a block diagram showing the
configuration of an instruction processing device;

Fig. 4 is a schematic diagram showing the
correspondence between a link stack and a return
address stack;

Fig. 5 is a schematic diagram showing the
signals used by the instruction processing device;

Fig. 6 shows a first determining circuit;

Fig. 7 shows a registering circuit;

Fig. 8 shows a selecting circuit;

Fig. 9 shows a first identifying circuit;

Fig. 10 shows a second identifying circuit;

Fig. 11 shows a second determining circuit;

Fig. 12 shows a controlling circuit;

Fig. 13 shows a latch circuit;

Fig. 14 shows an invalidating circuit;

Fig. 15 shows a flag generating circuit;

Fig. 16 shows an entry registered to a branch
history; and

Fig. 17 shows a third determining circuit.

Description of the Preferred Embodiments 

Preferred embodiments according to the present 
invention are hereinafter described in detail by 
referring to the drawings. 

Fig. 2A is a block diagram showing the principle
of a branch predicting device according to the
present invention. In a first aspect of the present
invention, the branch predicting device comprises a
storing circuit 1, a comparing circuit 2, and an
identifying circuit 3.

The storing circuit 1 stores information
specifying a return address of a subroutine when an
instruction equivalent to a subroutine call is
detected. The comparing circuit 2 makes a comparison
between information specifying a branch destination
address of an instruction which can possibly be an
instruction equivalent to a subroutine return and the
information specifying the return address, which is
stored in the storing circuit 1, and outputs the
result of the comparison, when the instruction which
can possibly be the instruction equivalent to the
subroutine return is detected. The identifying
circuit 3 identifies the instruction equivalent to
the subroutine return, which corresponds to the above
described instruction equivalent to the subroutine
call, based on the result of the comparison.

If an executed instruction (or an instruction to
be executed) performs an operation equivalent to a
subroutine call, the return address specified by that
instruction, information about the register storing
the return address, or the like is stored in the
storing circuit 1 as the information specifying the
return address.

If an executed instruction (or an instruction
to be executed) can possibly be an instruction which
performs an operation equivalent to a subroutine
return, the branch destination address specified by
that instruction, information about the register
storing the branch destination address, or the like
is selected as the information specifying the branch
destination address. Then, the comparing circuit 2
makes a comparison between the selected information
and the information specifying the return address.

If the information specifying the branch
destination address and the information specifying
the return address match, the identifying circuit 3
identifies the latter instruction as the instruction
equivalent to a subroutine return that corresponds
to the former. If they do not match, the identifying
circuit 3 does not identify the latter instruction as
the instruction equivalent to a subroutine return
that corresponds to the former.

By using the information specifying a return
address of a subroutine as described above, a correct
instruction pair equivalent to a subroutine
call/return can be dynamically extracted.
Accordingly, the correspondence of a call/return pair
can be correctly recognized, and an improper
correspondence can be prevented.

In a second aspect of the present invention, the 
branch predicting device comprises a stack circuit 4, 
a push circuit 5, a comparing circuit 2, and an 
identifying circuit 3. 

The stack circuit 4 stores information
specifying a return address of a subroutine. The push
circuit 5 pushes the information specifying the
return address onto the stack circuit 4 when an
instruction equivalent to a subroutine call is
detected.

The comparing circuit 2 makes a comparison
between information specifying a branch destination
address of an instruction which can possibly be an
instruction equivalent to a subroutine return and the
information specifying the return address, which is
stored in the top entry of the stack circuit 4, and
outputs the result of the comparison, when the
instruction which can possibly be the instruction
equivalent to the subroutine return is detected. The
identifying circuit 3 identifies the instruction
equivalent to the subroutine return, which
corresponds to the above described instruction
equivalent to the subroutine call, based on the
result of the comparison.

When an instruction which performs an operation
equivalent to the subroutine call is detected, the
push circuit 5 pushes the information specifying the
return address onto the stack circuit 4. When an
instruction which can possibly be an instruction
which performs an operation equivalent to the
subroutine return is detected, the comparing circuit
2 makes a comparison between the information
specifying the branch destination address of that
instruction and the information specifying the return
address, which has been pushed onto the stack circuit
4.

If the information specifying the branch
destination address and the information specifying
the return address match, the identifying circuit 3
identifies the latter instruction as the instruction
equivalent to the subroutine return that corresponds
to the former. If they do not match, the identifying
circuit 3 does not identify the latter instruction as
the instruction equivalent to the subroutine return
that corresponds to the former.

By pushing the information specifying a return
address of a subroutine onto the stack circuit 4 as
described above, the correspondence of a call/return
pair can be correctly recognized in a manner similar
to the branch predicting device in the first aspect,
and an improper correspondence can be prevented.

In a third aspect of the present invention, the
branch predicting device comprises a return address
stack circuit 6, a comparing circuit 2, and an
identifying circuit 3.

The return address stack circuit 6 stores a
return address of a subroutine when an instruction
equivalent to a subroutine call is detected. The
comparing circuit 2 makes a comparison between a
branch destination address of an instruction which
can possibly be an instruction equivalent to a
subroutine return and the return address stored in
the return address stack circuit 6, and outputs the
result of the comparison, when the instruction which
can possibly be the instruction equivalent to the
subroutine return is detected. The identifying
circuit 3 identifies the instruction equivalent to
the subroutine return, which corresponds to the above
described instruction equivalent to the subroutine
call, based on the result of the comparison.

When an instruction which performs an operation
equivalent to a subroutine call is detected, the
return address specified by that instruction is
pushed onto the return address stack circuit 6. Next,
when an instruction which can possibly be an
instruction which performs an operation equivalent to
a subroutine return is detected, the comparing
circuit 2 makes a comparison between the branch
destination address of that instruction and the
return address pushed onto the return address stack
circuit 6.

If the branch destination address and the return
address match, the identifying circuit 3 identifies
the latter instruction as the instruction equivalent
to a subroutine return that corresponds to the
former. If they do not match, the identifying circuit
3 does not identify the latter instruction as the
instruction equivalent to the subroutine return that
corresponds to the former.

By directly making a comparison between the
return address pushed onto the return address stack
circuit 6 and the branch destination address of an
instruction as described above, the correspondence of
a call/return pair can be correctly recognized in a
manner similar to the branch predicting device in the
first aspect, and an improper correspondence can be
prevented.
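
The direct address comparison of the third aspect can
be sketched as follows. The function name, the
addresses, and the decision to pop the entry on a
match are assumptions made for illustration; the
aspect itself only specifies the comparison and the
identification.

```python
def identify_return(return_address_stack, branch_destination):
    """Compare a branch destination with the most recently pushed return address."""
    if return_address_stack and return_address_stack[-1] == branch_destination:
        return_address_stack.pop()  # assumption: a matched entry is consumed
        return True                 # identified as the corresponding return
    return False                    # not identified; the stack is left untouched


stack = []
stack.append(0x104)  # call detected: its return address is pushed
stack.append(0x204)  # nested call detected
checks = [
    identify_return(stack, 0x310),  # ordinary branch: destination does not match
    identify_return(stack, 0x204),  # return for the nested call
    identify_return(stack, 0x104),  # return for the outer call
]
```

An ordinary branch leaves the stack unchanged, so the
two returns that follow are still paired with the
correct calls.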

For example, the storing circuit 1 and the stack
circuit 4, which are shown in Fig. 2A, correspond to
a link stack 33 and a return address stack 35, which
are shown in Fig. 3 and will be described later.
Additionally, for instance, the comparing circuit 2
and the identifying circuit 3, which are shown in
Fig. 2A, correspond to an EXNOR circuit 101, an OR
circuit 102, and an AND circuit 103, which are shown
in Fig. 11 and will be described later, or to a
comparing circuit 151 and an AND circuit 152, which
are shown in Fig. 17 and will be described later.
Furthermore, the push circuit 5 shown in Fig. 2A
corresponds to a controlling circuit which is shown
in Fig. 12 and will be described later, and the
return address stack circuit 6 shown in Fig. 2A
corresponds to the return address stack 35 shown in
Fig. 3.

In an instruction processing device, a link
register storing a return address is specified by an
instruction equivalent to a subroutine call, and a
branch by an instruction equivalent to a subroutine
return is taken with the specified link register.

The instruction equivalent to a subroutine call
or return includes, for example, an operation (OP)
code 11, a first operand 12, and a second operand 13
as shown in Fig. 2B. In the instruction equivalent to
a subroutine call, the first operand 12 represents
the number of a link register. In the instruction
equivalent to a subroutine return, the second operand
13 represents the number of the register storing a
branch destination address.

In this preferred embodiment, a link stack is
provided that registers the number of the link
register specified at the time of a subroutine call.
When a branch instruction appears that uses, as its
branch destination address, the address held in the
register whose number is registered to the link
stack, this branch instruction is recognized to be an
instruction equivalent to a subroutine return.

With such a control, instructions equivalent to a
subroutine call and return can be associated with
each other by using the number of a link register as
link information, so that it becomes possible to
dynamically extract an instruction pair equivalent to
a subroutine call/return. Accordingly, the
correspondence of the call/return pair can be
correctly recognized and an improper correspondence
can be prevented, whereby the accuracy of a branch
prediction by the return address stack can be
improved.

For instance, in the example shown in Fig. 1, a
correct call/return pair can be recognized by
comparing the number of the link register, which is
included in a call instruction, with the number of
the branch destination address register, which is
included in a return instruction. This also leads to
a successful branch prediction.

The first operand of the instruction "balr 14,
15" in Call 1 represents that the number of the link
register is "14", while the second operand of the
instruction "bcr 15, 14" in Return 1 represents that
the number of the branch destination address register
is "14". Accordingly, the latter instruction is
recognized to be an instruction equivalent to a
return, which corresponds to Call 1.

Furthermore, the first operand of the
instruction "balr 15, 13" in Call 2 represents that
the number of the link register is "15", while the
second operand of the instruction "bcr 7, 15" in
Return 2 represents that the number of the branch
destination address register is "15". Accordingly,
the latter instruction is recognized to be an
instruction equivalent to a return, which corresponds
to Call 2.
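
The pairing described for Fig. 1 can be traced with a
short sketch. The instruction sequence below is the
one named in the text, with "bcr 4, 3" placed before
Return 2 as an assumption; the function name and the
use of list indexes to identify instructions are
illustrative only.

```python
# Branch instructions of the Fig. 1 example in their assumed execution order.
TRACE = [
    ("balr", 14, 15),  # index 0: Call 1
    ("balr", 15, 13),  # index 1: Call 2
    ("bcr", 4, 3),     # index 2: ordinary conditional branch in S2
    ("bcr", 7, 15),    # index 3: Return 2
    ("bcr", 15, 14),   # index 4: Return 1
]


def pair_calls_and_returns(trace):
    """Pair calls and returns by comparing link register numbers."""
    link_stack, pairs = [], []
    for index, (op, first, second) in enumerate(trace):
        if op == "balr":
            # First operand of a call: number of the link register.
            link_stack.append((first, index))
        elif op == "bcr" and link_stack and link_stack[-1][0] == second:
            # Second operand of a return: number of the branch destination register.
            _, call_index = link_stack.pop()
            pairs.append((call_index, index))
    return pairs


pairs = pair_calls_and_returns(TRACE)
```

The result pairs Call 2 with Return 2 and Call 1 with
Return 1, while "bcr 4, 3" is ignored because
register 3 is not at the top of the link stack.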

Next, the operations of the information
processing device in this preferred embodiment will
be explained in detail by using an example of an
architecture that does not provide a dedicated
subroutine call/return instruction pair. Such an
architecture is stipulated, for example, by the POO
(Principles Of Operation) of ESA (Enterprise Systems
Architecture)/390.

An instruction available as a subroutine call is
considered to be an instruction which can store, in
a register, the return address (link address) used by
an instruction equivalent to a subroutine return.
Examples of such an instruction include bal, balr,
bas, basr, bassm, etc.

Additionally, almost all general branch
instructions are available as a subroutine return.
Above all, a branch instruction specifying a branch
destination address with one register, that is, an RR
form instruction, is apt to be used. Examples of the
RR form instruction include bcr, bsm, etc. As a
matter of course, these instructions are also used as
normal unconditional or conditional branch
instructions.

Furthermore, such an architecture may include
instructions which can cause an improper
correspondence of a subroutine call/return pair,
although they appear infrequently. Examples of such
an instruction include RX form instructions such as
lpsw, bc, etc. Some interrupt events may also cause
a subroutine call/return pair to be improperly
associated.

A branch instruction in the RX form, a
representative of which is bc, does not always
specify the return address with only one register,
and in particular, specifies a displacement in some
cases. Besides, a return address may sometimes be
changed by a process that rewrites the value of the
link register, etc.

If such an instruction is used as a subroutine
return, the return address that was registered to the
return address stack at the time of the call is not
correct. Therefore, it is desirable not to reference
the return address stack at the time of the return.
Alternatively, a correct return address can possibly
be obtained by referencing the predicted branch
destination registered to a branch history, as is
done for a normal branch instruction.

Furthermore, lpsw does not directly specify a
branch destination address with a register, and uses
the data sequence in a memory, which is indicated by
an operand, as a branch destination address. When
such an instruction appears, the correspondence of a
call/return pair may not be maintained properly.
Similarly, when an interrupt occurs, a call/return
pair can be improperly associated depending on the
type of the interrupt.

Accordingly, some countermeasure must be built
into the return address stack. One possible
countermeasure is to erase all of the entries of the
return address stack and the link stack when such
instructions are executed or when such an interrupt
occurs. With such a control, an improper
correspondence in the return address stack can be
prevented, so that the performance degradation caused
by a chain of improper prediction results, triggered
by an initial occurrence, does not take place.
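
A minimal sketch of this erase-all control is shown
below. The class and method names are hypothetical,
and the set of disruptive opcodes contains only lpsw,
the example named in the text; which interrupt types
qualify is left to the caller.

```python
class PredictorStacks:
    """Link stack and return address stack that are always cleared together."""

    # Instructions assumed to break call/return pairing; lpsw is the example in the text.
    FLUSHING_OPCODES = frozenset({"lpsw"})

    def __init__(self):
        self.link_stack = []
        self.return_address_stack = []

    def on_call(self, link_register, return_address):
        self.link_stack.append(link_register)
        self.return_address_stack.append(return_address)

    def flush(self):
        # Erase every entry so that no stale pairing survives.
        self.link_stack.clear()
        self.return_address_stack.clear()

    def on_instruction(self, opcode):
        if opcode in self.FLUSHING_OPCODES:
            self.flush()

    def on_interrupt(self, breaks_pairing):
        # Only some interrupt types disturb the pairing.
        if breaks_pairing:
            self.flush()


stacks = PredictorStacks()
stacks.on_call(14, 0x104)
stacks.on_call(15, 0x204)
stacks.on_instruction("bcr")   # ordinary instruction: entries are kept
kept = len(stacks.link_stack)
stacks.on_instruction("lpsw")  # disruptive instruction: both stacks are erased
```

Erasing everything costs a few missed predictions
immediately afterwards, but it keeps one bad pairing
from shifting every later prediction.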

Furthermore, although fundamental branch
instructions are implemented by hardwired logic, some
branch instructions are controlled by microcode
because they are accompanied by other complicated
operations. Registering such complicated branch
instructions to a branch history offers little
advantage, since few benefits are obtained relative
to the added circuit complexity. For the same reason,
the return address stack does not operate for these
instructions either.

As described above, however, if such complicated
instructions can possibly be an instruction
equivalent to a subroutine call or return, the
entries of the return address stack become improperly
associated unless measures are taken for these
instructions, which leads to a degradation of
performance.

Therefore, when an instruction equivalent to a
subroutine return is detected after the execution of
a branch instruction that is equivalent to a
subroutine call and is unregistered to the branch
history, and the detected instruction is considered
to correspond to that branch instruction, control is
performed so that the detected instruction is not
recognized to be an instruction equivalent to a
return in the branch history or the return address
stack.

In addition, a particular register is used as a
link register very frequently in some cases, for
example, where a programming guide or the like
recommends that a particular register be used as a
link register. In such a system, an instruction using
the particular register is assumed always to be
recognized as an instruction equivalent to a
subroutine call or return. In this way, the entries
of the link stack can be used efficiently, whereby a
great effect can be obtained even with a small-scale
link stack.

Furthermore, if "0" is specified as the register
number of a branch destination address in a branch
instruction, the branch is not taken. In such an
architecture, it is impossible to determine a
corresponding instruction equivalent to a subroutine
return by using the register number "0" as link
information. Accordingly, if "0" is specified as the
number of the link register in the instruction
equivalent to a subroutine call, this instruction is
not recognized to be an instruction equivalent to a
subroutine call.
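
The exclusion of register number "0" can be expressed
as a simple guard on the push side. The function name
and the list used as a link stack are assumptions for
illustration.

```python
def register_call(link_stack, link_register):
    """Push the link register number unless it is 0; report whether a call was recognized."""
    if link_register == 0:
        # Register number 0 cannot serve as link information, so no call is recognized.
        return False
    link_stack.append(link_register)
    return True


link_stack = []
recognized = [register_call(link_stack, number) for number in (14, 0, 15)]
```

Only the calls that specify link registers 14 and 15
leave an entry, so no entry can wait for a return
through register 0 that would never be taken.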

Fig. 3 is a block diagram showing the
configuration of an instruction processing device in
this preferred embodiment. The instruction processing
device shown in Fig. 3 comprises an instruction
fetching circuit 21, a branch predicting mechanism
22, a decoder 23, a branch destination address
generating circuit 24, a branch instruction execution
processing circuit 25, and an instruction execution
completion processing circuit 26. This device
executes instructions with an out-of-order method. In
an instruction processing device adopting the
out-of-order method, succeeding instruction sequences
are sequentially entered into a plurality of
pipelines without waiting for the completion of the
execution of one instruction in order to improve
performance.

The instruction fetching circuit 21 and the
branch predicting mechanism 22 correspond to the
circuit of an instruction fetch pipeline. The branch
predicting mechanism 22 comprises a predicting
circuit 31, a comparing circuit 32, and a link stack
33. The predicting circuit 31 comprises a branch
history 34 and a return address stack 35.

The decoder 23, the branch destination address
generating circuit 24, the branch instruction
execution processing circuit 25, and the instruction
execution completion processing circuit 26 correspond
to the circuit of an instruction execution pipeline.
The branch instruction execution processing circuit
25 comprises a plurality of RSBRs (Reservation
Stations for BRanch) 36.

The instruction fetch pipeline has an
instruction address issuance cycle (IA), a table
cycle (IT), a buffer cycle (IB), and a result cycle
(IR). The instruction execution pipeline has a decode
cycle (D), an address calculation cycle (A), an
execution cycle (X), an update cycle (U), and a write
cycle (W).

The RSBR 36 is a stack in which branch
instructions wait for the process that controls them.
The branch instruction execution processing circuit
25 can select an entry in the stack that is ready to
be processed, and can execute a branch instruction
whenever necessary in an order different from that
specified by the program.

Among the branch instructions handled by the
RSBR 36, bal, balr (except for balr 1, 14), bras,
bas, and basr are handled as instructions equivalent
to a subroutine call, while bcr, bsm, and
balr 1, 14 are handled as instructions equivalent
to a subroutine return. Although bassm is an
instruction equivalent to a subroutine call, it is a
complicated instruction which is not handled by the
RSBR 36.
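
The grouping above can be written as a small
classification function. The function name, the
string labels, and the representation of operands as
a tuple are assumptions; only the opcode groups come
from the text.

```python
CALL_OPCODES = {"bal", "balr", "bras", "bas", "basr"}
RETURN_OPCODES = {"bcr", "bsm"}


def classify(opcode, operands=()):
    """Classify a branch instruction as the RSBR-handled groups are described."""
    if opcode == "balr" and tuple(operands) == (1, 14):
        return "return"  # balr 1, 14 is the stated exception
    if opcode in CALL_OPCODES:
        return "call"
    if opcode in RETURN_OPCODES:
        return "return"
    return "other"       # e.g. bassm, which is not handled by the RSBR


kinds = [
    classify("balr", (14, 15)),
    classify("balr", (1, 14)),
    classify("bcr", (15, 14)),
    classify("bassm", (14, 15)),
]
```

Here "return" only marks a candidate: whether the
instruction is actually paired with a call still
depends on the register number comparison.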

If a branch is found to be taken as a result of
the execution of a branch instruction by the branch
instruction execution processing circuit 25, the
instruction address at the branch destination and the
address of the branch instruction itself are
registered to the branch history 34 as a pair. When
fetching a branch instruction, the instruction
fetching circuit 21 searches the branch history 34
prior to the fetch of the next instruction and
predicts the branch destination.
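
The register-and-search behavior of the branch
history can be sketched with a dictionary. A hardware
branch history is a finite table, so the unbounded
dictionary, the class name, and the addresses used
here are simplifying assumptions.

```python
class BranchHistory:
    """Maps the address of a taken branch instruction to its branch destination."""

    def __init__(self):
        self.table = {}

    def register(self, branch_address, destination_address):
        # Called when an executed branch turns out to be taken.
        self.table[branch_address] = destination_address

    def predict(self, fetch_address):
        # Searched at fetch time; None means no branch is predicted at this address.
        return self.table.get(fetch_address)


history = BranchHistory()
history.register(0x1000, 0x2000)  # taken branch at 0x1000 went to 0x2000
hit = history.predict(0x1000)     # next fetch is redirected to the stored destination
miss = history.predict(0x1004)    # no entry: sequential fetch continues
```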

When the decoder 23 detects an instruction
equivalent to a subroutine call, the number of the
link register, which is represented by the operand of
that instruction, is pushed onto the link stack 33,
and the instruction address at a corresponding return
destination is pushed onto the return address stack
35.

When the decoder 23 detects an instruction which
can possibly be an instruction equivalent to a
subroutine return, the comparing circuit 32 makes a
comparison between the register number registered to
the top entry of the link stack 33, and the number of
the branch destination address register, which is
represented by the operand of the detected
instruction. If these two register numbers match, the
comparing circuit 32 determines that the detected
instruction is an instruction which performs an
operation equivalent to a subroutine return, and
outputs the result of the comparison to the
predicting circuit 31.

At this time, the register number is popped from
the link stack 33, and the corresponding instruction
address is popped from the return address stack 35.
The popped instruction address is passed to the
instruction fetching circuit 21 as a predicted branch
destination.

The entries of the link stack 33 correspond to
those of the return address stack 35 one to one as
shown in Fig. 4. These two stacks perform push and
pop operations at the same time. Here, a 4-bit
register number <0:3> is stored in the entry of the
link stack 33, while a 32-bit branch destination
address <0:31> is stored in the entry of the return
address stack 35. These stacks are normally arranged
as n-stage stacks composed of "n" (n≥1) entries.

Fig. 5 shows the signals used in the instruction
processing device shown in Fig. 3. The decoder 23
outputs signals +D_BALR, +D_BAL, +D_BRAS, +D_BAS,
+D_BASR, +D_BALR_1E, +D_BCR, +D_BSM, +D_BASSM, and
+D_OPC<8:15> to the branch instruction execution
processing circuit 25.

The signals +D_BALR, +D_BAL, +D_BRAS, +D_BAS,
+D_BASR, +D_BALR_1E, +D_BCR, +D_BSM, and +D_BASSM
respectively become a logic "1" when balr, bal, bras,
bas, basr, balr 1, 14, bcr, bsm, and bassm are
detected. The signal +D_OPC<8:15> represents the data
of bits <8:15> of a machine language instruction.

The branch instruction execution processing
circuit 25 outputs signals
+BRHIS_UPDATE_SUBROUTINE_CALL,
+BRHIS_UPDATE_SUBROUTINE_RTN,
+BRHIS_UPDATE_CALL_RTN_REG<0:3>, +BRHIS_UPDATE_BSM,
and +D_BASSM to the branch predicting mechanism 22.

The signal +BRHIS_UPDATE_SUBROUTINE_CALL becomes
a logic "1" when an instruction is determined to be
an instruction equivalent to a subroutine call. The
signal +BRHIS_UPDATE_SUBROUTINE_RTN becomes a logic
"1" when an instruction is determined to be an
instruction which can possibly be an instruction
equivalent to a subroutine return. The signal
+BRHIS_UPDATE_CALL_RTN_REG<0:3> represents the number
of the register specified by an instruction operand.
The signal +BRHIS_UPDATE_BSM becomes a logic "1" upon
completion of the execution of the bsm instruction.

Next, the configuration and the operations of 
the instruction processing device shown in Fig. 3 are 
explained in detail by referring to Figs. 6 to 17. 

When an instruction is decoded by the decoder
23, the signals shown in Fig. 5 are input to the RSBR
36, and it is determined whether the instruction is
an instruction equivalent to a subroutine call or an
instruction which can possibly be an instruction
equivalent to a subroutine return. For an instruction
which can possibly be equivalent to a subroutine
return, a stricter correspondence with a subroutine
return is identified by the circuit of the link stack
33, which will be described later.

Fig. 6 shows a determining circuit within the
RSBR 36. In this figure, an input signal -D_BALR_1E
represents the negation of the signal +D_BALR_1E
shown in Fig. 5, and becomes a logic "0" when the
instruction "balr 1, 14" is decoded. An AND circuit
41 outputs the logical product of the input signals
+D_BALR and -D_BALR_1E to an OR circuit 42.
Accordingly, when a balr instruction other than
"balr 1, 14" is decoded, the output of the AND
circuit 41 becomes a logic "1".

The OR circuit 42 outputs the logical sum of the
output signal from the AND circuit 41 and the input
signals +D_BAL, +D_BRAS, +D_BASR, and +D_BAS as a
signal +D_SUBROUTINE_CALL. This signal
+D_SUBROUTINE_CALL is used as a flag which becomes a
logic "1" if a decoded instruction is an instruction
equivalent to a subroutine call.

Additionally, an OR circuit 43 outputs the
logical sum of the input signals +D_BALR_1E, +D_BCR,
and +D_BSM as a signal +D_SUBROUTINE_RETURN. This
signal +D_SUBROUTINE_RETURN is used as a flag which
becomes a logic "1" if a decoded instruction is an
instruction which can possibly be an instruction
equivalent to a subroutine return.
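
The gate network of Fig. 6 can be written out as Boolean expressions. The Python sketch below is a hedged illustration only: the function name `decode_flags` and its keyword parameters are hypothetical stand-ins for the decoder signals, and it assumes that both +D_BALR and +D_BALR_1E are a logic "1" when "balr 1, 14" is decoded.

```python
from typing import Tuple


def decode_flags(*, d_balr: bool = False, d_balr_1e: bool = False,
                 d_bal: bool = False, d_bras: bool = False,
                 d_bas: bool = False, d_basr: bool = False,
                 d_bcr: bool = False, d_bsm: bool = False) -> Tuple[bool, bool]:
    """Return (+D_SUBROUTINE_CALL, +D_SUBROUTINE_RETURN)."""
    # AND circuit 41: balr, but not the special form "balr 1, 14".
    and41 = d_balr and not d_balr_1e
    # OR circuit 42: any instruction equivalent to a subroutine call.
    call = and41 or d_bal or d_bras or d_basr or d_bas
    # OR circuit 43: any instruction that may be a subroutine return.
    ret = d_balr_1e or d_bcr or d_bsm
    return call, ret
```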

If a decoded instruction is a branch
instruction, the decoding result is normally
registered to the RSBR 36. At this time, the flag
representing the result of the determination of a
subroutine call/return, and the information of a link
register or a branch destination address register,
etc. are registered to the RSBR 36.

With the architecture of ESA/390 POO, the number
of the link register is specified in the bits <8:11>
of an instruction (machine language instruction)
which can possibly be an instruction equivalent to a
subroutine call, and the number of the branch address
register is specified in the bits <12:15> of an
instruction (machine language instruction) which can
possibly be an instruction equivalent to a subroutine
return. Accordingly, the data of the bits <8:15> is
registered as the information of these registers.

Fig. 7 shows a registering circuit within the
RSBR 36. In this figure, an input signal +RSBR_VALID
becomes a logic "1" while the corresponding RSBR 36
is valid. A latch circuit 51 latches the value of the
input signal +D_OPC<8:15>, and outputs the latched
value as a signal +RSBR_OPC<8:15>.

A latch circuit 52 latches the values of the
flags +D_SUBROUTINE_CALL and +D_SUBROUTINE_RETURN,
which are generated by the determining circuit shown
in Fig. 6, and outputs the latched values
respectively as signals +RSBR_SUBROUTINE_CALL and
+RSBR_SUBROUTINE_RETURN.

When the signal +RSBR_VALID becomes a logic "1",
the registration of the information is terminated.
The information registered to the latch circuits 51
and 52 is preserved while the corresponding RSBR 36
is valid.

Next, the subroutine call/return determination
result and the register information registered to the
RSBR 36 are transmitted to the branch predicting
mechanism 22 simultaneously with the other branch
history information, when the branch history
information is updated. If the instruction is an
instruction equivalent to a subroutine call, the
number of the link register is selected as the
register information. If the instruction is an
instruction which can possibly be an instruction
equivalent to a subroutine return, the number of the
branch destination address register is selected as
the register information.

Fig. 8 shows a selecting circuit within the RSBR
36. In this figure, an AND circuit 61 outputs to an
OR circuit 63 the logical product of the signals
+RSBR_SUBROUTINE_CALL and +RSBR_OPC<8:11> from the
registering circuit shown in Fig. 7. Accordingly, the
number of the link register is output from the AND
circuit 61 when the flag +RSBR_SUBROUTINE_CALL is
set.

An AND circuit 62 outputs the logical product of
the signals +RSBR_SUBROUTINE_RETURN and
+RSBR_OPC<12:15> from the registering circuit shown
in Fig. 7 to the OR circuit 63. Accordingly, the
number of the branch destination address register is
output from the AND circuit 62 when the flag
+RSBR_SUBROUTINE_RETURN is set.

Then, the OR circuit 63 outputs the logical sum
of the output signals from the AND circuits 61 and 62
as a signal +RSBR_CALL_RETURN_REG<0:3>. Here, since
the flags +RSBR_SUBROUTINE_CALL and
+RSBR_SUBROUTINE_RETURN are never set at the same
time, the OR circuit 63 selectively outputs the
output signals from the AND circuits 61 and 62.
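
The selection performed by the circuit of Fig. 8 can be sketched in software as follows. This is an illustration only: the function name `select_call_return_reg` is hypothetical, and the sketch assumes that bits <8:15> are held as one byte in which bits <8:11> form the upper four bits and bits <12:15> the lower four bits.

```python
def select_call_return_reg(opc_8_15: int, is_call: bool, is_return: bool) -> int:
    """Model of +RSBR_CALL_RETURN_REG<0:3> produced by the OR circuit 63."""
    link_reg = (opc_8_15 >> 4) & 0xF   # bits <8:11>: link register number
    target_reg = opc_8_15 & 0xF        # bits <12:15>: branch address register
    and61 = link_reg if is_call else 0      # gated by +RSBR_SUBROUTINE_CALL
    and62 = target_reg if is_return else 0  # gated by +RSBR_SUBROUTINE_RETURN
    # The two flags are never set together, so the OR acts as a selector.
    return and61 | and62
```

For example, with the byte 0xEF (register fields 14 and 15) the call flag selects 14, while with the byte 0xFE the return flag selects 14.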

The signals +RSBR_SUBROUTINE_CALL,
+RSBR_SUBROUTINE_RETURN, and
+RSBR_CALL_RETURN_REG<0:3> are output to the branch
predicting mechanism 22 respectively as the signals
+BRHIS_UPDATE_SUBROUTINE_CALL,
+BRHIS_UPDATE_SUBROUTINE_RTN, and
+BRHIS_UPDATE_CALL_RTN_REG<0:3>, which are shown in
Fig. 5.

Meanwhile, as described above, a branch is
not taken if "0" is specified as the number of the
branch destination address register in branch
instructions (including an instruction equivalent to
a subroutine return). Conversely, if "0" is specified
as the number of the link register, it is desirable
not to identify the instruction as an instruction
equivalent to a subroutine call, even if it was
determined to be such an instruction when decoded.

Therefore, a control signal which becomes valid 
only if a transmitted register number is not "0" is 
generated by arranging an identifying circuit shown 
in Fig. 9 within the branch predicting mechanism 22. 

In Fig. 9, a NAND circuit 71 obtains the logical
product of the negation of the four bits of the
signal +BRHIS_UPDATE_CALL_RTN_REG<0:3>, and outputs
the negation of the logical product as a signal
+SBRTN_LINK_REG_VAL.

Accordingly, this output signal becomes a logic
"1" only if the register number represented by the
signal +BRHIS_UPDATE_CALL_RTN_REG<0:3> is not "0",
which represents that the link register is valid.
Control of the link stack 33 with this signal will be
described later.

Even if a particular number other than "0" is
used as the number of the branch destination address
register to indicate a branch instruction by which a
branch is not taken, a similar control signal is
generated with a circuit similar to that shown in
Fig. 9.

Furthermore, since the bassm instruction
available as a subroutine call is implemented not by
hardwired logic but by microcode, it is not
registered to the branch history 34 and its
information is not transmitted when the branch
history information is updated. Instead, the signal
+D_BASSM, which is shown in Fig. 5 and generated at
the time of decoding, is transmitted to the branch
predicting mechanism 22.

Therefore, control for the bassm instruction is
performed by arranging an identifying circuit shown
in Fig. 10 in the branch predicting mechanism 22.
Here, the return instruction corresponding to the
bassm instruction is assumed to be only bsm.

In Fig. 10, an AND circuit 81 outputs the
logical product of the output of a latch circuit 83
and that of a NAND circuit 84 to an OR circuit 82.
The OR circuit 82 outputs the logical sum of the
input signal +D_BASSM and the output signal of the
AND circuit 81 to the latch circuit 83. The latch
circuit 83 substantially performs the operations of
a set/reset flip-flop, latches the output signal of
the OR circuit 82, and outputs the latched signal to
the NAND circuit 84.

The NAND circuit 84 outputs the negation of the
logical product of the signal +BRHIS_UPDATE_BSM shown
in Fig. 5, the control signal +SBRTN_LINK_REG_VAL
shown in Fig. 9, and the output signal of the latch
circuit 83 as a signal -SBRTN_BASSM_BSM_RTN_VALID.
This signal -SBRTN_BASSM_BSM_RTN_VALID represents
that the executed bsm instruction is the return
instruction corresponding to the above described
bassm instruction if it is a logic "0".

With such an identifying circuit, if a bassm
instruction that branches is executed, the signal
+D_BASSM becomes a logic "1" and the output of
the latch circuit 83 also becomes a logic "1". When
the signal +BRHIS_UPDATE_BSM becomes a logic "1" upon
completion of the execution of the bsm instruction
while the output of the latch circuit 83 and the
signal +SBRTN_LINK_REG_VAL shown in Fig. 9 are a
logic "1", the executed bsm instruction is identified
as the return instruction corresponding to the above
described bassm instruction.

Because the signal -SBRTN_BASSM_BSM_RTN_VALID
becomes a logic "0" at this time, the output of
the AND circuit 81 also becomes a logic "0". Since
the signal +D_BASSM is also a logic "0", the output
of the latch circuit 83 becomes a logic "0" as well.

As described above, the output signal of the
latch circuit 83 is used as a predetermined flag
which represents that the bassm and the bsm
instructions are detected. This flag is set when a
bassm instruction that branches is detected, and is
reset when the corresponding bsm instruction is
detected.
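
The set/reset behavior of the identifying circuit in Fig. 10 can be modeled as a one-bit state machine. The Python sketch below is an illustration under assumptions: the class name `BassmBsmTracker`, the attribute `pending`, and the idea of evaluating the circuit once per `step` call are hypothetical simplifications of the latch timing.

```python
class BassmBsmTracker:
    """One-bit model of the latch circuit 83 and its surrounding gates."""

    def __init__(self) -> None:
        self.pending = False  # output of the latch circuit 83

    def step(self, d_bassm: bool, brhis_update_bsm: bool,
             link_reg_val: bool) -> bool:
        """Advance one step; return +SBRTN_BASSM_BSM_RTN_VALID."""
        # NAND circuit 84 yields -SBRTN_BASSM_BSM_RTN_VALID.
        nand84 = not (brhis_update_bsm and link_reg_val and self.pending)
        # AND circuit 81 keeps the flag set until the matching bsm arrives.
        and81 = self.pending and nand84
        # OR circuit 82 sets the flag when a bassm is decoded.
        self.pending = d_bassm or and81
        return not nand84
```

A bsm instruction seen while the flag is clear returns `False`, so it remains a candidate for the ordinary return determination.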

Furthermore, a signal
+SBRTN_BASSM_BSM_RTN_VALID, not shown, is also
generated simultaneously with the signal
-SBRTN_BASSM_BSM_RTN_VALID. This signal
+SBRTN_BASSM_BSM_RTN_VALID corresponds to the
negation of the signal -SBRTN_BASSM_BSM_RTN_VALID,
and represents that an executed bsm instruction is
the return instruction corresponding to the above
described bassm instruction if it is a logic "1".

The bsm instruction thus identified as
corresponding to the bassm instruction is not
recognized as an instruction equivalent to a return
in the branch history 34 or on the return address
stack 35. This is because the bassm instruction
itself is not registered as an instruction equivalent
to a call.

The branch predicting mechanism 22 determines
instructions equivalent to subroutine call/return
with the signals transmitted from the branch
instruction execution processing circuit 25 and the
particular control signals generated by the
identifying circuits shown in Figs. 9 and 10.

Fig. 11 shows a determining circuit within the
branch predicting mechanism 22. In this figure, an
input signal -BRHIS_UPDATE_SUBROUTINE_RTN corresponds
to the negation of the signal
+BRHIS_UPDATE_SUBROUTINE_RTN shown in Fig. 5.

An input signal +RTN_LINK_REG_STK0<0:3>
represents the register number stored in the top
entry of the link stack 33. An input signal
+SBRTN_LINK_REG_EQ_E becomes a logic "1" if the
signal +BRHIS_UPDATE_CALL_RTN_REG<0:3> shown in Fig.
5 represents the register number "14", and becomes a
logic "0" if the signal
+BRHIS_UPDATE_CALL_RTN_REG<0:3> represents any other
number.

An AND circuit 91 outputs to an AND circuit 92
the logical product of the signal
+BRHIS_UPDATE_SUBROUTINE_CALL shown in Fig. 5, and
the signal +SBRTN_LINK_REG_VAL shown in Fig. 9. The
AND circuit 92 outputs the logical product of the
output signal of the AND circuit 91 and the signal
-BRHIS_UPDATE_SUBROUTINE_RTN as a signal
+BR_COMP_SUBROUTINE_CALL.

This signal +BR_COMP_SUBROUTINE_CALL is used as
a flag which represents an instruction equivalent to
a subroutine call (a subroutine call flag) in the
branch predicting mechanism 22. If this flag is a
logic "1", the instruction executed by the branch
instruction execution processing circuit 25 is
determined to be an instruction equivalent to a
subroutine call. If the executed instruction
specifies the register having the number "0" as a
link register, this flag becomes a logic "0" and the
instruction is determined not to be an instruction
equivalent to a subroutine call.

An EXNOR circuit 101 makes a comparison between
the signal +BRHIS_UPDATE_CALL_RTN_REG<0:3> shown in
Fig. 5 and the signal +RTN_LINK_REG_STK0<0:3>, and
outputs the negation of the exclusive logical sum of
the two signals. An OR circuit 102 outputs the
logical sum of the output signal of the EXNOR circuit
101 and the signal +SBRTN_LINK_REG_EQ_E.

Then, an AND circuit 103 outputs the logical
product of the signal +BRHIS_UPDATE_SUBROUTINE_RTN
shown in Fig. 5, the signal +SBRTN_LINK_REG_VAL shown
in Fig. 9, the signal -SBRTN_BASSM_BSM_RTN_VALID
shown in Fig. 10, and the output signal of the OR
circuit 102 as a signal +BR_COMP_SUBROUTINE_RTN.

This signal +BR_COMP_SUBROUTINE_RTN is used as
a flag which represents an instruction equivalent to
a subroutine return (a subroutine return flag) in the
branch predicting mechanism 22. If this flag is a
logic "1", the instruction executed by the branch
instruction execution processing circuit 25 is
determined to be an instruction equivalent to a
subroutine return. This determination operation is
performed before the corresponding branch history
information is registered to the branch history 34 or
the return address stack 35.

The subroutine return determining circuit
composed of the EXNOR circuit 101, the OR circuit
102, and the AND circuit 103 corresponds to the
comparing circuit 32 shown in Fig. 3. With this
determining circuit, the number of the branch
destination address register in the executed
instruction which can possibly be an instruction
equivalent to a subroutine return is compared with
the top entry of the link stack 33. If they match, the
executed instruction is determined to be an
instruction equivalent to a subroutine return.

Note, however, that the bsm instruction
corresponding to the bassm instruction is not handled
as an instruction equivalent to a return in the
branch predicting mechanism 22 as described above.
Therefore, the output of the AND circuit 103 is
suppressed by the signal -SBRTN_BASSM_BSM_RTN_VALID.

Furthermore, the register having the number "14"
is customarily used as a branch destination address
register in a subroutine return in many cases.
Therefore, if this register is used as the branch
destination address register, an executed instruction
is determined to be an instruction equivalent to a
subroutine return with the signal
+SBRTN_LINK_REG_EQ_E regardless of the result of the
comparison made by the EXNOR circuit 101.

If a particular number other than "14" is
used as the number of the branch destination address
register to indicate an instruction equivalent
to a subroutine return, similar control is performed
by a circuit similar to that shown in Fig. 11.
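
The flag generation of Figs. 9 and 11 can be summarized in a few Boolean expressions. The Python sketch below is only an illustration: the function name `determine_call_return` and its parameter names are hypothetical, and the top entry of the link stack 33 is assumed to read as 0 when it is empty.

```python
from typing import Tuple


def determine_call_return(update_call: bool, update_rtn: bool, reg: int,
                          link_stack_top: int,
                          bassm_bsm_rtn_valid: bool = False) -> Tuple[bool, bool]:
    """Return (+BR_COMP_SUBROUTINE_CALL, +BR_COMP_SUBROUTINE_RTN)."""
    link_reg_val = reg != 0   # Fig. 9: register number is not "0"
    link_reg_eq_e = reg == 14  # +SBRTN_LINK_REG_EQ_E
    # AND circuits 91 and 92: call flag, valid register, not a return.
    call = update_call and link_reg_val and not update_rtn
    # EXNOR circuit 101 and OR circuit 102: match with the top entry,
    # or use of the register "14".
    or102 = (reg == link_stack_top) or link_reg_eq_e
    # AND circuit 103, suppressed for a bsm paired with a bassm.
    rtn = update_rtn and link_reg_val and (not bassm_bsm_rtn_valid) and or102
    return call, rtn
```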

Using the subroutine call and return flags thus
generated, the link stack 33 performs push and pop
operations under the control circuit shown in Fig. 12.

Here, it is assumed that the link stack 33 is
composed of two entries, and the input signals
+RTN_LINK_REG_STK0<0:3> and +RTN_LINK_REG_STK1<0:3>
respectively represent the register numbers stored in
the first entry (top entry 0) and the second entry
(entry 1).

An input signal -SBRTN_LINK_REG_EQ_E corresponds
to the negation of the signal +SBRTN_LINK_REG_EQ_E
shown in Fig. 11. An input signal +BRHIS_UPDATE_TAKEN
becomes a logic "1" when a branch by a branch
instruction is taken and branch history information
is updated.

First of all, an AND circuit 111 outputs the
logical product of the above described two signals.
An AND circuit 112 outputs the logical product of the
flag +BR_COMP_SUBROUTINE_CALL shown in Fig. 11 and
the output signal of the AND circuit 111 as an
operation signal +PUSH_RTN_STACK_LINK_REG. This
signal is used to instruct the push operations of the
link stack 33 and the return address stack 35, and
becomes a logic "1" when an instruction equivalent to
a subroutine call is executed and the branch history
information is updated.

An AND circuit 113 outputs the logical product
of the flag +BR_COMP_SUBROUTINE_RTN shown in Fig. 11
and the output signal of the AND circuit 111 as an
operation signal +POP_RTN_STACK_LINK_REG. This signal
is used to instruct the pop operations of the link
stack 33 and the return address stack 35, and becomes
a logic "1" when an instruction equivalent to a
subroutine return is executed and the branch history
information is updated.

Here, suppose that the instruction equivalent to
a subroutine call, which specifies "14" as the number
of the link register, and the instruction equivalent
to a subroutine return, which specifies "14" as the
number of the branch destination address register,
always make a call/return instruction pair. In this
case, the correspondence between the call and the
return instructions can be extracted without using
the link stack 33.

Therefore, the push and the pop operation
signals are suppressed by using the signal
-SBRTN_LINK_REG_EQ_E in order not to operate the link
stack 33 in such a case. As a result, the entries of
the link stack 33 can be prevented from being wasted,
thereby realizing efficient operations even with a
smaller number of stages.

Then, an AND circuit 114 outputs the logical
product of the signal +BRHIS_UPDATE_CALL_RTN_REG<0:3>
shown in Fig. 5, and an operation signal
+PUSH_RTN_STACK_LINK_REG. An AND circuit 115 outputs
the logical product of the signal
+RTN_LINK_REG_STK1<0:3> and an operation signal
+POP_RTN_STACK_LINK_REG.

An OR circuit 116 outputs the logical sum of the
output signals of the AND circuits 114 and 115 as a
signal +SET_RTN_LINK_REG_STK0<0:3>. This signal
represents the register number set in the top entry
of the link stack 33.

Here, the operation signals
+PUSH_RTN_STACK_LINK_REG and +POP_RTN_STACK_LINK_REG
never become a logic "1" at the same time. Therefore,
the OR circuit 116 selectively outputs the output
signals of the AND circuits 114 and 115. Accordingly,
with the push operation, the number of the link
register, which is specified by an instruction
equivalent to a subroutine call, is set. Meanwhile,
with the pop operation, the register number
stored in the second entry of the link stack 33 is
set.

Besides, an AND circuit 117 outputs the logical
product of the signal +RTN_LINK_REG_STK0<0:3> and the
operation signal +PUSH_RTN_STACK_LINK_REG as a signal
+SET_RTN_LINK_REG_STK1<0:3>. This signal represents
the register number set in the second entry of the
link stack 33. In the push operation, this number
matches the register number stored in the top entry
of the link stack 33.
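
The next-state logic of the two-entry link stack in Fig. 12 can be sketched as a pure function. This Python sketch is an illustration under assumptions: the function name `link_stack_next` and its parameters are hypothetical, the entries are assumed to be held unchanged when neither a push nor a pop is requested, and the second entry is assumed to receive 0 on a pop because the AND circuit 117 is gated only by the push signal.

```python
from typing import Tuple


def link_stack_next(stk0: int, stk1: int, reg: int, call_flag: bool,
                    rtn_flag: bool, taken: bool) -> Tuple[int, int]:
    """Return the next (entry 0, entry 1) of the two-entry link stack 33."""
    link_reg_eq_e = reg == 14
    # AND circuit 111: branch taken and register is not "14".
    and111 = taken and not link_reg_eq_e
    push = call_flag and and111   # +PUSH_RTN_STACK_LINK_REG (AND circuit 112)
    pop = rtn_flag and and111     # +POP_RTN_STACK_LINK_REG (AND circuit 113)
    if not (push or pop):
        return stk0, stk1         # assumed: entries are held
    # AND circuits 114 and 115 feeding the OR circuit 116.
    set0 = (reg if push else 0) | (stk1 if pop else 0)
    # AND circuit 117: the old top entry moves down on a push.
    set1 = stk0 if push else 0
    return set0, set1
```

A call or return that uses the register "14" leaves both entries unchanged in this model, which mirrors the suppression by the signal -SBRTN_LINK_REG_EQ_E.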

Fig. 13 shows latch circuits storing a register
number within the link stack 33. In this figure, an
input signal -PUSH_POP_RTN_LINK_REG_STK becomes a
logic "1" upon termination of the push or the pop
operation.

A latch circuit 121 latches the signal
+SET_RTN_LINK_REG_STK0<0:3> as the top entry, and
outputs the latched signal as the signal
+RTN_LINK_REG_STK0<0:3> shown in Fig. 12. Meanwhile,
a latch circuit 122 latches the signal
+SET_RTN_LINK_REG_STK1<0:3> shown in Fig. 12 as the
second entry, and outputs the latched signal as the
signal +RTN_LINK_REG_STK1<0:3> shown in Fig. 12.

When the signal -PUSH_POP_RTN_LINK_REG_STK
becomes a logic "1", the registration of the register
numbers to these entries is terminated, and the
registered register numbers are held until this
signal becomes a logic "0".

Meanwhile, the above described lpsw instruction
(complicated instruction) can possibly be either a
subroutine call or a subroutine return instruction.
Therefore, this instruction is considered to possibly
cause an improper correspondence between a call and a
return. Likewise, if an interrupt occurs and it is an
interrupt of the type which does not return to the
original program after the interrupt is processed,
this interrupt is also considered to cause an
improper correspondence between a call and a return.

Accordingly, if such an event (instruction,
interrupt, etc.) occurs, all of the entries of the
link stack 33 and the return address stack 35 are
cleared and the stored information is invalidated at
the time of the execution of the instruction or the
interrupt.

Fig. 14 shows an invalidating circuit within the
branch predicting mechanism 22. In this figure, an
input signal +MICRO_PURGE_RTN_ADRS_STK is a signal
which clears the entries of the link stack 33 and the
return address stack 35. This signal becomes a logic
"1" when an instruction or an interrupt, which can
possibly cause an improper correspondence between a
call and a return, occurs.

A NOR circuit 131 outputs the negation of the
logical sum of the operation signals
+PUSH_RTN_STACK_LINK_REG and +POP_RTN_STACK_LINK_REG,
which are shown in Fig. 12, and a signal
+MICRO_PURGE_RTN_ADRS_STK as the signal
-PUSH_POP_RTN_LINK_REG_STK shown in Fig. 13.

Accordingly, if the signal
+MICRO_PURGE_RTN_ADRS_STK becomes a logic "1", the
signal -PUSH_POP_RTN_LINK_REG_STK becomes a logic
"0", so that the register numbers stored by the latch
circuits 121 and 122 shown in Fig. 13 are cleared.

Furthermore, an instruction equivalent to a
subroutine return may not return to the return
destination corresponding to a subroutine call, that
is, to the instruction address immediately succeeding
an instruction equivalent to a subroutine call. When
such an instruction is recognized, a flag indicating
that the return destination of the instruction
equivalent to the subroutine return differs can be
set in the branch history 34.

Fig. 15 shows the circuit generating such a flag
in the RSBR 36. In this figure, an input signal +D_BC
becomes a logic "1" when an operation code "bc" is
detected by the decoder 23. An input signal
-D_DISP_EQ_0 becomes a logic "1" if the displacement
specified by an instruction is not "0".

Additionally, input signals +D_BR_EQ_E and
+D_XR_EQ_E become a logic "1" respectively when the
numbers of base and index registers specified by
instructions are "14". These signals are output from
the decoder 23 to the RSBR 36.

An OR circuit 141 outputs the signal
representing the logical sum of the signals
+D_BR_EQ_E and +D_XR_EQ_E. An AND circuit 142 outputs
the logical product of the signals +D_BC and
-D_DISP_EQ_0, and the output signal of the OR circuit
141 as a signal +D_BC_GIDDY_RTN.

A latch circuit 143 latches the signal
+D_BC_GIDDY_RTN from the AND circuit 142, and outputs
the latched signal as a signal +RSBR_BC_GIDDY_RTN.
This signal is held by the latch circuit 143 while
the corresponding RSBR 36 is valid, and is used as a
flag indicating that the return destination of an
instruction equivalent to a subroutine return
differs.

This flag +RSBR_BC_GIDDY_RTN is transmitted to
the branch predicting mechanism 22 as a signal
+BRHIS_UPDATE_BC_GIDDY_RTN, and is set in a flag
GIDDY RTN in the entry of the branch history 34 as
shown in Fig. 16.

The entry in the branch history 34 shown in Fig.
16 stores a branch instruction address IAR, a branch
destination address TIAR, and flags CALL and RTN in
addition to the flag GIDDY RTN. The flags CALL and
RTN respectively correspond to a subroutine call flag
and a subroutine return flag.

For example, if a branch instruction "bc m,
d(14)" the displacement of which is not "0" is
decoded, the signal +D_BC_GIDDY_RTN becomes a logic
"1", so that the flag +RSBR_BC_GIDDY_RTN is set.
Accordingly, when this branch instruction is
registered to the branch history 34, a logic "1" is
stored in the corresponding flag GIDDY RTN.

If this flag GIDDY RTN is set at the time of the
branch prediction made by the predicting circuit 31,
the return address stack 35 performs a pop operation
similar to that at the time of the prediction of a
return instruction. However, the predicting circuit
31 outputs not the branch destination address popped
from the return address stack 35, but the branch
destination address registered to the branch history
34 as a predicted branch destination address.
Accordingly, the instruction at the branch
destination predicted by the branch history 34 is
fetched, and the result of the prediction made by the
return address stack 35 is discarded.
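
The generation and use of the flag GIDDY RTN can be sketched as follows. This Python sketch is an illustration only: the names `HistoryEntry`, `giddy_rtn_flag`, and `predict_target` are hypothetical, the return address stack 35 is modeled as a list whose index 0 is the top entry, and the handling of an ordinary RTN entry is an assumed simplification of the predicting circuit 31.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HistoryEntry:
    iar: int                 # branch instruction address (IAR)
    tiar: int                # branch destination address (TIAR)
    rtn: bool = False        # RTN flag
    giddy_rtn: bool = False  # GIDDY RTN flag


def giddy_rtn_flag(is_bc: bool, displacement: int,
                   base_reg: int, index_reg: int) -> bool:
    """Model of +D_BC_GIDDY_RTN (OR circuit 141 and AND circuit 142)."""
    or141 = base_reg == 14 or index_reg == 14
    return is_bc and displacement != 0 and or141


def predict_target(entry: HistoryEntry, return_stack: List[int]) -> int:
    """Return the predicted branch destination for a branch history hit."""
    if entry.giddy_rtn:
        # Pop as for a return, but discard the popped address.
        if return_stack:
            return_stack.pop(0)
        return entry.tiar
    if entry.rtn and return_stack:
        return return_stack.pop(0)  # ordinary return: use the stack
    return entry.tiar
```

Popping while discarding the popped address keeps the return address stack aligned with later returns, even though the destination itself comes from the branch history.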

In the above described preferred embodiment,
whether or not an executed instruction (or an
instruction to be executed) is an instruction
equivalent to a subroutine return is determined by
making a comparison between the number of the link
register registered to the link stack 33 and that of
the branch destination address register in the
instruction. As another preferred embodiment, a
similar determination may be made by making a
comparison between the return address registered to
the return address stack 35 and the branch
destination address of an executed instruction (or an
instruction to be executed) without using the link
stack 33.

With this method, when an instruction equivalent
to a return, which does not return to the instruction
immediately succeeding the corresponding call
instruction, such as the above described bc
instruction, etc., appears, the correspondence of a
call/return pair to be recognized becomes improper,
so that the performance inherent in the return
address stack 35 is not fully utilized. However, this
method has an advantage that there is no need to
newly arrange the link stack 33.

Fig. 17 shows the circuit which makes such a
determination within the branch predicting mechanism
22. In this figure, a signal +BRHIS_UPDATE_TIAR
represents the branch destination address of an
instruction which can possibly be an instruction
equivalent to a subroutine return, and is transmitted
from the RSBR 36.

A comparing circuit 151 makes a comparison
between this signal +BRHIS_UPDATE_TIAR and the top
entry (entry 0) of the return address stack 35, and
outputs the signal of the logic "1" if they match.
Here, the return address stack 35 is illustrated as
a stack having "n" stages. An AND circuit 152 outputs
the logical product of the signal
+BRHIS_UPDATE_SUBROUTINE_RTN in Fig. 5 and the output
signal of the comparing circuit 151 as the signal
+BR_COMP_SUBROUTINE_RTN shown in Fig. 12.

The determining circuit shown in Fig. 17 can
possibly be a substitute for the determining circuit
for an instruction equivalent to a subroutine return,
which is shown in Fig. 11, and can generate a
subroutine return flag without referencing the
entries of the link stack 33. Accordingly, the link
stack 33 becomes unnecessary in this case.
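
The address-based determination of Fig. 17 reduces to a single comparison. The Python sketch below is an illustration only: the function name `is_return_by_address` is hypothetical, the return address stack 35 is modeled as a list whose index 0 is the top entry, and an empty stack is assumed never to match.

```python
from typing import List


def is_return_by_address(update_rtn: bool, branch_target: int,
                         return_stack: List[int]) -> bool:
    """Model of +BR_COMP_SUBROUTINE_RTN generated without the link stack."""
    # Comparing circuit 151: branch destination versus the top entry.
    matches_top = bool(return_stack) and return_stack[0] == branch_target
    # AND circuit 152: only candidates for a subroutine return qualify.
    return update_rtn and matches_top
```

A return that branches to an address offset from the stored return address, such as the bc instruction with a nonzero displacement, does not match in this model, which corresponds to the drawback noted for this embodiment.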

In the above described preferred embodiments,
the link stack 33 and the return address stack 35 are
mainly assumed to be stacks having two stages.
However, a similar control can be performed also when
stacks having an arbitrary number of stages are used.
Furthermore, a subroutine call/return instruction
pair can be recognized by comparing arbitrary
information specifying the return address of a
subroutine, other than a register number or an
instruction address.

According to the present invention, a correct
subroutine call/return instruction pair can be
dynamically extracted in an information processing
device having a branch predicting mechanism such as
a return address stack, etc. Accordingly, an improper
correspondence of a call/return pair in the branch
predicting mechanism can be prevented, thereby
improving the accuracy of the branch prediction of an
instruction equivalent to a subroutine return.

What is claimed is:

1. A branch predicting device, comprising:

a storing circuit storing information specifying
a return address of a subroutine when an instruction
equivalent to a subroutine call is detected;

a comparing circuit making a comparison between
information specifying a branch destination address
of an instruction which can possibly be an
instruction equivalent to a subroutine return and the
information specifying the return address stored in
said storing circuit, and outputting a result of the
comparison, when the instruction which can possibly
be the instruction equivalent to the subroutine
return is detected; and

an identifying circuit identifying an
instruction equivalent to a subroutine return, which
corresponds to the instruction equivalent to the
subroutine call, based on the result of the
comparison.

2. The branch predicting device according to
claim 1, wherein

said storing circuit stores a register number of
a link register, which is specified by the
instruction equivalent to the subroutine call, as the
information specifying the return address.



3. The branch predicting device according to
claim 1, wherein

said storing circuit stores the return address 
of the subroutine as the information specifying the 
return address. 

4. A branch predicting device, comprising:

a stack circuit storing information specifying
a return address of a subroutine;

a push circuit pushing the information
specifying the return address onto said stack
circuit, when an instruction equivalent to a
subroutine call is detected;

a comparing circuit making a comparison between
information specifying a branch destination address
of an instruction which can possibly be an
instruction equivalent to a subroutine return and the
information specifying the return address stored in
a top entry of said stack circuit, and outputting a
result of the comparison, when the instruction which
can possibly be the instruction equivalent to the
subroutine return is detected; and

an identifying circuit identifying an
instruction equivalent to a subroutine return, which
corresponds to the instruction equivalent to the
subroutine call, based on the result of the
comparison.

5. The branch predicting device according to
claim 4, wherein:

said push circuit pushes a register number of a
link register, which is specified by the instruction
equivalent to the subroutine call, onto said stack
circuit as the information specifying the return
address;

said comparing circuit makes a comparison
between a register number of a branch destination
address register, which is specified by the
instruction which can possibly be the instruction
equivalent to the subroutine return, and a register
number stored in the top entry of said stack circuit;
and

said identifying circuit identifies the
instruction which can possibly be the instruction
equivalent to the subroutine return as the
instruction equivalent to the subroutine return when
the compared register numbers match.

6. The branch predicting device according to
claim 5, wherein

said identifying circuit identifies the
instruction which can possibly be the instruction
equivalent to the subroutine return as the
instruction equivalent to the subroutine return
regardless of the result of the comparison, if the
register number of the branch destination address
register corresponds to a particular register.

7. The branch predicting device according to
claim 5, wherein

said push circuit does not push the register
number of the link register onto said stack circuit
if the register number of the link register
corresponds to a particular register.

8. The branch predicting device according to
claim 4, further comprising

a pop circuit popping said stack circuit when
said identifying circuit identifies the instruction
which can possibly be the instruction equivalent to
the subroutine return as the instruction equivalent
to the subroutine return, and a branch by the
instruction equivalent to the subroutine return is
taken.

9. The branch predicting device according to
claim 1, further comprising

a predicting circuit storing branch history
information for a branch prediction, wherein

said comparing circuit makes the comparison
between the information specifying the branch
destination address and the information specifying
the return address, when the branch history
information is registered to said predicting circuit.

10. The branch predicting device according to
claim 1, further comprising

a circuit invalidating the information stored in
said storing circuit when an event occurs which
causes a correspondence between a subroutine call and
a subroutine return to be improper.

11. The branch predicting device according to
claim 1, further comprising:

a predicting circuit storing branch history
information for a branch prediction; and

a setting circuit setting in said predicting
circuit a flag indicating that a return destination
of a detected instruction equivalent to a subroutine
return differs, when an instruction equivalent to a
subroutine return, which does not return to an
instruction address immediately succeeding the
instruction equivalent to the subroutine call, is
detected.

12. The branch predicting device according to
claim 11, wherein

said predicting circuit comprises a return
address stack circuit storing the return address of
the subroutine, pops said return address stack
circuit if the flag is recognized at the time of a
branch prediction, and does not use a popped return
address as a predicted branch destination.

13. The branch predicting device according to
claim 1, further comprising:

a predicting circuit storing branch history
information for a branch prediction; and

a circuit performing a control such that a
predetermined flag is set when an instruction
equivalent to a subroutine call, which is
unregistered to said predicting circuit, is detected,
the predetermined flag is reset when an instruction
equivalent to a subroutine return, which corresponds
to the unregistered instruction equivalent to the
subroutine call, is detected, and the instruction
equivalent to the subroutine return corresponding to
the unregistered instruction is not identified as an
instruction equivalent to a subroutine return in said
predicting circuit.

14. A branch predicting device, comprising:

a return address stack circuit storing a return
address of a subroutine when an instruction
equivalent to a subroutine call is detected;

a comparing circuit making a comparison between
a branch destination address of an instruction which
can possibly be an instruction equivalent to a
subroutine return, and the return address stored in
said return address stack circuit, and outputting a
result of the comparison, when the instruction which
can possibly be the instruction equivalent to the
subroutine return is detected; and

an identifying circuit identifying an
instruction equivalent to a subroutine return, which
corresponds to the instruction equivalent to the
subroutine call, based on the result of the
comparison.

15. A branch predicting method, comprising:

registering information specifying a return
address of a subroutine when an instruction
equivalent to a subroutine call is detected;

making a comparison between information
specifying a branch destination address of an
instruction which can possibly be an instruction
equivalent to a subroutine return and the registered
information specifying the return address, when the
instruction which can possibly be the instruction
equivalent to the subroutine return is detected;

identifying the instruction which can possibly
be the instruction equivalent to the subroutine
return as an instruction equivalent to a subroutine
return, which corresponds to the instruction
equivalent to the subroutine call, if the information
specifying the branch destination address and the
information specifying the return address match;

identifying the instruction which can possibly
be the instruction equivalent to the subroutine
return not as the instruction equivalent to the
subroutine return, which corresponds to the
instruction equivalent to the subroutine call, if the
information specifying the branch destination address
and the information specifying the return address do
not match; and

making a branch prediction by using an
identification result.



16. A branch predicting device, comprising:

storing means for storing information specifying
a return address of a subroutine when an instruction
equivalent to a subroutine call is detected;

comparing means for making a comparison between
information specifying a branch destination address
of an instruction which can possibly be an
instruction equivalent to a subroutine return and the
information specifying the return address stored in
said storing means, and for outputting a result of
the comparison, when the instruction which can
possibly be the instruction equivalent to the
subroutine return is detected; and

identifying means for identifying an instruction
equivalent to a subroutine return, which corresponds
to the instruction equivalent to the subroutine call,
based on the result of the comparison.



17. A branch predicting device, comprising:

stack means for storing information specifying
a return address of a subroutine;

push means for pushing the information
specifying the return address onto said stack means,
when an instruction equivalent to a subroutine call
is detected;

comparing means for making a comparison between
information specifying a branch destination address
of an instruction which can possibly be an
instruction equivalent to a subroutine return and the
information specifying the return address stored in
a top entry of said stack means, and for outputting
a result of the comparison, when the instruction
which can possibly be the instruction equivalent to
the subroutine return is detected; and

identifying means for identifying an instruction
equivalent to a subroutine return, which corresponds
to the instruction equivalent to the subroutine call,
based on the result of the comparison.

18. A branch predicting device, comprising:

return address stack means for storing a return
address of a subroutine when an instruction
equivalent to a subroutine call is detected;

comparing means for making a comparison between
a branch destination address of an instruction which
can possibly be an instruction equivalent to a
subroutine return, and the return address stored in
said return address stack means, and for outputting
a result of the comparison, when the instruction
which can possibly be the instruction equivalent to
the subroutine return is detected; and

identifying means for identifying an instruction
equivalent to a subroutine return, which corresponds
to the instruction equivalent to the subroutine call,
based on the result of the comparison.




Abstract of the Disclosure 



A register number of a link register, which is
specified by an instruction equivalent to a
subroutine call, is registered. The number of a
branch destination register in a branch instruction
which can possibly be an instruction equivalent to a
subroutine return is compared with the registered
register number. If they match, this branch
instruction is identified as an instruction
equivalent to a subroutine return.
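
The register-number comparison summarized above can be sketched as follows. The class name ReturnIdentifier and its method names are hypothetical, and the sketch keeps only a single registered register number rather than a stack, which is a simplification made for this example.

```python
class ReturnIdentifier:
    """Hypothetical model of the register-number comparison."""

    def __init__(self):
        # No link register number has been registered yet.
        self.registered_link_register = None

    def on_call(self, link_register):
        """Register the link register number specified by a call-equivalent."""
        self.registered_link_register = link_register

    def is_return(self, branch_destination_register):
        """Compare a candidate's destination register with the registered one."""
        if self.registered_link_register is None:
            return False
        return branch_destination_register == self.registered_link_register
```

For instance, after on_call(15) a candidate branch through register 15 is identified as equivalent to a subroutine return, while a branch through any other register is not.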